Get the statistics based on the application name and IP address

  • Jump to comment-1
    ibrar.ahmed@percona.com2022-08-08T15:21:06+00:00
    While working on pg_stat_stements, I got some questions from customers to have statistics by application and IP address. I know that we are collecting the statistics by query id, user id, database id and top-level query. There is no way to collect the statistics based on IP address and application name. That's possible that multiple applications issue the same queries with the same user on the same database. We cannot segregate those queries from which application this query comes. I know we can this in the log file with log_line_prefix, but I want to see that aggregates like call count based on IP and application name. I did some POC and had a patch. But before sharing the patch. I need to know if there has been any previous discussion about this topic; by the way, I did some Googling to find that but failed. Thoughts? -- Ibrar Ahmed. Senior Software Engineer, PostgreSQL Consultant.
    • Jump to comment-1
      rjuju123@gmail.com2022-08-08T17:11:40+00:00
      Hi, On Mon, Aug 08, 2022 at 08:21:06PM +0500, Ibrar Ahmed wrote: > While working on pg_stat_stements, I got some questions from customers to > have statistics by application and IP address. > [...] > name. I did some POC and had a patch. But before sharing the patch. > > I need to know if there has been any previous discussion about this topic; > by the way, I don't think there was any discussion on this exactly, but there have been some related discussions. This would likely bring 2 problems. First, for now each entry contains its own query text in the query file. There can already be some duplication, which isn't great, but adding the application_name and/or IP address will make things way worse, so you would probably need to fix that first. There has been some discussion about it recently (1) but more work and benchmarking are needed. The other problem is the multiplication of entries. It's a well known limitation that pg_stat_statements eviction are so costly that it makes it unusable. The last numbers I saw about it was ~55% overhead (2). Adding application_name or ip address to the key would probably make pg_stat_statements unusable for anyone who would actually need those metrics. [1]: https://www.postgresql.org/message-id/flat/604E3199-2DD2-47DD-AC47-774A6F97DCA9%40amazon.com [2]: https://twitter.com/AndresFreundTec/status/1105585237772263424